Text classification model combining word annotations
Xianfeng YANG, Jiahe ZHAO, Ziqiang LI
Journal of Computer Applications 2022, 42(5): 1317-1323. DOI: 10.11772/j.issn.1001-9081.2021030489
Abstract

Traditional text feature representation methods cannot fully resolve the polysemy of words. To address this problem, a new text classification model combining word annotations was proposed. Firstly, using an existing Chinese dictionary, the dictionary annotations of words were selected according to their contexts in the text, and Bidirectional Encoder Representations from Transformers (BERT) encoding was applied to them to generate annotated sentence vectors. Then, the annotated sentence vectors were fused with the word embedding vectors at the input layer to enrich the feature information of the input text. Finally, a Bidirectional Gated Recurrent Unit (BiGRU) was used to learn the features of the input text, and an attention mechanism was introduced to highlight the key feature vectors. Experimental results on the public THUCNews dataset and the Sina Weibo sentiment classification dataset show that text classification models combining BERT word annotations perform significantly better than those without word annotations. The proposed BERT word annotation_BiGRU_Attention model achieves the highest precision and recall among all compared models, with F1-scores, reflecting overall performance, of 98.16% and 96.52% on the two datasets respectively.
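To make the described pipeline concrete, below is a minimal PyTorch sketch of the architecture the abstract outlines: word embeddings concatenated with precomputed BERT-encoded annotation vectors, a BiGRU encoder, and additive attention pooling before classification. All names and dimensions (AnnotationBiGRUAttention, embed_dim, annot_dim, hidden_dim) are illustrative assumptions, not the authors' released code; the annotation vectors are assumed to be produced by a separate BERT encoding step.

import torch
import torch.nn as nn

class AnnotationBiGRUAttention(nn.Module):
    """Hypothetical sketch of the paper's pipeline: word embeddings are
    fused with BERT-encoded dictionary-annotation vectors, passed through
    a BiGRU, and pooled with additive attention."""

    def __init__(self, vocab_size, embed_dim=128, annot_dim=768,
                 hidden_dim=128, num_classes=10):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # BiGRU over the fused word + annotation representation
        self.bigru = nn.GRU(embed_dim + annot_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        # Additive attention scoring for each time step
        self.attn = nn.Linear(2 * hidden_dim, 1)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids, annot_vecs):
        # token_ids:  (batch, seq_len) word indices
        # annot_vecs: (batch, seq_len, annot_dim) BERT-encoded annotations
        words = self.embedding(token_ids)
        fused = torch.cat([words, annot_vecs], dim=-1)   # enrich the input layer
        states, _ = self.bigru(fused)                    # (batch, seq, 2*hidden)
        weights = torch.softmax(self.attn(states), dim=1)  # highlight key features
        context = (weights * states).sum(dim=1)          # attention-weighted sum
        return self.classifier(context)

# Smoke test with random inputs (shapes only; no real THUCNews data).
model = AnnotationBiGRUAttention(vocab_size=5000)
logits = model(torch.randint(0, 5000, (2, 20)), torch.randn(2, 20, 768))
print(logits.shape)  # torch.Size([2, 10])

The concatenation at the input layer is one plausible reading of "fused with the word embedding vectors"; gated or weighted fusion would be an equally valid variant under the abstract's description.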
